add new host injection variant symlinks for 2025.06 #263
bedroge wants to merge 9 commits into EESSI:main
| #versions/2025.06/compat/linux/aarch64/lib/amd: '$(EESSI_202506_AMD_OVERRIDE:-/cvmfs/software.eessi.io/defaults/amd)' | ||
| #versions/2025.06/compat/linux/riscv64/lib/amd: '$(EESSI_202506_AMD_OVERRIDE:-/cvmfs/software.eessi.io/defaults/amd)' | ||
| #versions/2025.06/compat/linux/x86_64/lib/amd: '$(EESSI_202506_AMD_OVERRIDE:-/cvmfs/software.eessi.io/defaults/amd)' |
I've commented these out for now, as we won't be using them yet. On the other hand, would it do any harm to create them already?
| # defaults/amd: '$(EESSI_AMD_OVERRIDE_DEFAULT:-/dev/null)' | ||
| defaults/nvidia: '$(EESSI_NVIDIA_OVERRIDE_DEFAULT:-/dev/null)' | ||
| defaults/override: '$(EESSI_LIB_OVERRIDE_DEFAULT:-/dev/null)' | ||
| host_injections: '$(EESSI_HOST_INJECTIONS:-/opt/eessi)' |
Not to muddy things too much here, but I wonder whether this should also point to /dev/null by default. It is a bit of a security hole, since you can inject into MPI binaries via locations in there, and that is not obvious: the default doesn't appear in your local configuration, so you would need to already know that you should be monitoring that path.
Granted, for it to be a problem you have to manually create /opt/eessi and lose control of that directory, but if that happened, nothing in your CVMFS setup would record it.
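To make the default behaviour under discussion concrete, here is a minimal sketch of how a variant symlink target of the form `$(NAME:-default)` could be resolved. It is not the CVMFS client's implementation: `resolve_variant_symlink` and the `client_config` dict are hypothetical stand-ins for the variables a site sets in its CVMFS client configuration, and treating an empty value like an unset one is an assumption borrowed from shell `:-` semantics.

```python
import re

# Matches the variant-symlink forms $(NAME) and $(NAME:-default).
_VARIANT = re.compile(
    r"^\$\((?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>.*))?\)$"
)


def resolve_variant_symlink(target, client_config):
    """Return the path a variant symlink target would expand to.

    `client_config` is a hypothetical dict standing in for the variables
    defined in the site's CVMFS client configuration.
    """
    match = _VARIANT.match(target)
    if match is None:
        return target  # plain symlink target, nothing to expand
    value = client_config.get(match.group("name"))
    if value:  # assumption: an empty value falls back to the default
        return value
    return match.group("default") or ""


# With nothing set locally, the default from the repository applies silently.
print(resolve_variant_symlink("$(EESSI_HOST_INJECTIONS:-/opt/eessi)", {}))
# A site that sets the variable explicitly overrides it.
print(resolve_variant_symlink(
    "$(EESSI_HOST_INJECTIONS:-/opt/eessi)",
    {"EESSI_HOST_INJECTIONS": "/dev/null"},
))
```

The first call illustrates the concern: the effective path comes from the repository side, so nothing in the local configuration mentions it.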
The issue I see here is that it would actively break installations that have relied on the default value so far. Also, this is only an issue for EESSI 2023.06, not for 2025.06 and newer: the latter only search the following paths:
Shared library search path:
(libraries located via /cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/etc/ld.so.cache)
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/lib64 (system search path)
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/usr/lib64 (system search path)
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/lib/override (system search path)
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/lib/nvidia (system search path)
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/x86_64/lib/amd (system search path)
The only thing that's still used from the original host_injections is
/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/.lmod/SitePackage.lua
which includes local SitePackage.lua files. I'm not saying that's harmless, nor that there's no good reason to change that default, but:
1. I wouldn't do it in this PR (it could hold back this PR, which is a requirement for CUDA support in 2025.06).
2. I would accompany it with a fair amount of effort to inform sites that may rely on this default. I'm thinking: a clear and explicit change in the documentation on how host_injections behaves, and how that changes in the future. Also: broadcast on Slack. Also: see point (4).
3. Maybe we should even allow for a transition period, to allow sites to explicitly set a non-default host_injections in their CVMFS config before we push this change.
4. We COULD consider trying to (ab)use SitePackage.lua to warn people about the incoming change. E.g. you could check
   a) is their host_injections resolving to /opt/eessi
   b) do they have anything (any files/dirs) in that subdirectory

   And if A & B are both true, print a warning with every module command being run to inform the site that per date XYZ they will have to set the variant symlink explicitly. One downside is: it's not easy to check if host_injections resolves to /opt/eessi because that's the default or because the site set that explicitly in their CVMFS config. A cvmfs_config showconfig software.eessi.io does not show values for variant symlinks, even if they are explicitly set in the config, so we cannot easily see where the value comes from.
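The two-part check from point (4) could look roughly like the sketch below. It is written in Python purely for illustration (a real hook would live in SitePackage.lua), `should_warn` is a hypothetical helper, and it assumes the CVMFS client presents the already-expanded, absolute target when the symlink is read. As noted above, it cannot tell an explicit setting of /opt/eessi from the default.

```python
import os

DEFAULT_TARGET = "/opt/eessi"  # default value of host_injections in the diff


def should_warn(host_injections_link, default_target=DEFAULT_TARGET):
    """Return True if (a) the link resolves to the default and (b) it is in use."""
    # (a) Does the symlink point at the default location?
    try:
        target = os.readlink(host_injections_link)
    except OSError:
        return False  # missing, or not a symlink: nothing to warn about
    if os.path.normpath(target) != os.path.normpath(default_target):
        return False

    # (b) Does that location contain any files or directories?
    try:
        with os.scandir(target) as entries:
            return any(True for _ in entries)
    except OSError:
        return False  # target absent or unreadable: treat as unused


if __name__ == "__main__":
    # Hypothetical link location, for illustration only.
    link = "/cvmfs/software.eessi.io/host_injections"
    if should_warn(link):
        print("WARNING: host_injections relies on the default /opt/eessi; "
              "set the variant symlink explicitly in your CVMFS config.")
```

Returning False on any error keeps the check quiet on systems where /opt/eessi was never created, which matches the intent of only warning sites that actually use the default.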
See EESSI/software-layer-scripts#158.